 logistic model


A proposal for PU classification under Non-SCAR using clustering and logistic model

Furmanczyk, Konrad, Paczutkowski, Kacper

arXiv.org Machine Learning

This study investigates a cluster cleaning algorithm that is computationally simple and can solve the PU (positive-unlabeled) classification problem when the SCAR (selected completely at random) condition does not hold. A secondary objective is to determine how robust the LassoJoint method is to perturbations of the SCAR condition. In the first step of the algorithm, cleaning labels are obtained from 2-means clustering. Logistic regression is then fitted to the cleaned data, in which the positive label is assigned to the observations marked positive by the cleaning step together with the additional true positive observations, and the negative label is assigned to all remaining observations. The algorithm is evaluated on 11 real data sets from machine learning repositories and one synthetic set. The results show that the clustering algorithm is effective when the SCAR condition is violated and that the LassoJoint algorithm is moderately robust in this setting.
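The two-step procedure described in this abstract (2-means cleaning, then logistic regression on the cleaned labels) could look roughly like the sketch below. It is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the "positive" cluster is chosen, so the sketch assumes it is the cluster with the higher share of labeled positives. The synthetic data generator, the labeling propensity, and all function names (`make_pu_data`, `two_means`, `cluster_clean_pu`, and so on) are hypothetical, and plain NumPy gradient descent stands in for whatever solver the paper uses.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def make_pu_data(n=400, seed=0):
    """Hypothetical synthetic PU data with a non-SCAR labeling mechanism."""
    rng = np.random.default_rng(seed)
    y = (rng.random(n) < 0.5).astype(int)  # true classes (unobserved in PU learning)
    X = rng.normal(size=(n, 2)) + np.where(y[:, None] == 1, 2.0, -2.0)
    # Non-SCAR assumption: the chance that a positive is labeled depends on x.
    s = ((y == 1) & (rng.random(n) < sigmoid(X[:, 0] - 2.0))).astype(int)
    return X, y, s

def two_means(X, n_iter=50, seed=0):
    """Lloyd's algorithm with k = 2; returns a cluster index (0 or 1) per row."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=2, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        new_centers = np.array([
            X[assign == k].mean(axis=0) if np.any(assign == k) else centers[k]
            for k in range(2)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Unregularized logistic regression fitted by batch gradient descent."""
    Xb = np.hstack([np.ones((len(X), 1)), X])  # prepend an intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        w -= lr * Xb.T @ (sigmoid(Xb @ w) - y) / len(y)
    return w

def predict_proba(w, X):
    Xb = np.hstack([np.ones((len(X), 1)), X])
    return sigmoid(Xb @ w)

def cluster_clean_pu(X, s):
    """Step 1: 2-means cleaning labels. Step 2: logistic fit on cleaned labels."""
    assign = two_means(X)
    # Assumption: the positive cluster is the one richer in labeled positives.
    rates = [s[assign == k].mean() if np.any(assign == k) else 0.0 for k in range(2)]
    pos_cluster = int(np.argmax(rates))
    # Positive = in the positive cluster OR a known labeled positive; else negative.
    y_clean = ((assign == pos_cluster) | (s == 1)).astype(float)
    return fit_logistic(X, y_clean), y_clean

X, y, s = make_pu_data()
w, y_clean = cluster_clean_pu(X, s)
accuracy = ((predict_proba(w, X) > 0.5) == (y == 1)).mean()
print(f"accuracy against the hidden true labels: {accuracy:.3f}")
```

On well-separated classes like these the cleaning step recovers most true labels, so the classifier never needs an estimate of the labeling propensity. How the approach behaves on overlapping classes is exactly what the paper's experiments address and is not something this toy example can show.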


Inferring Change Points in Regression via Sample Weighting

Arpino, Gabriel, Venkataramanan, Ramji

arXiv.org Machine Learning

We study the problem of identifying change points in high-dimensional generalized linear models, and propose an approach based on sample-weighted empirical risk minimization. Our method, Weighted ERM, encodes priors on the change points via weights assigned to each sample, to obtain weighted versions of standard estimators such as M-estimators and maximum-likelihood estimators. Under mild assumptions on the data, we obtain a precise asymptotic characterization of the performance of our method for general Gaussian designs, in the high-dimensional limit where the number of samples and covariate dimension grow proportionally. We show how this characterization can be used to efficiently construct a posterior distribution over change points. Numerical experiments on both simulated and real data illustrate the efficacy of Weighted ERM compared to existing approaches, demonstrating that sample weights constructed with weakly informative priors can yield accurate change point estimators. Our method is implemented as an open-source package, weightederm, available in Python and R.
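To make the idea of encoding a change-point prior through sample weights concrete, here is a minimal sketch for a linear model with a single change point. Everything in it is an assumption made for illustration: the weights are taken to be the prior probability that a sample precedes the change point, the estimator is weighted least squares with a tiny ridge term, and the posterior is a simple plug-in Gaussian likelihood. The paper instead builds its posterior from an asymptotic characterization in the high-dimensional regime, which this low-dimensional sketch does not reproduce, and the code does not use the `weightederm` package. All function names are hypothetical.

```python
import numpy as np

def simulate(n=300, p=5, tau=120, sigma=1.0, seed=0):
    """Hypothetical data: rows 0..tau-1 follow beta1, rows tau..n-1 follow beta2."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    beta1, beta2 = np.ones(p), -np.ones(p)
    signal = np.where(np.arange(n) < tau, X @ beta1, X @ beta2)
    return X, signal + sigma * rng.normal(size=n), tau

def uniform_prior(n):
    """Weakly informative prior: change point uniform on 1..n-1 (never at 0)."""
    prior = np.full(n, 1.0 / (n - 1))
    prior[0] = 0.0
    return prior

def weighted_least_squares(X, y, w, lam=1e-6):
    """Minimize sum_i w_i (y_i - x_i^T b)^2 + lam * ||b||^2."""
    Xw = X * w[:, None]
    return np.linalg.solve(X.T @ Xw + lam * np.eye(X.shape[1]), Xw.T @ y)

def change_point_posterior(X, y, prior, sigma=1.0):
    # w1[i] = prior probability that sample i lies before the change point.
    w1 = np.clip(1.0 - np.cumsum(prior), 0.0, 1.0)
    b1 = weighted_least_squares(X, y, w1)        # estimate for the first regime
    b2 = weighted_least_squares(X, y, 1.0 - w1)  # estimate for the second regime
    r1 = (y - X @ b1) ** 2
    r2 = (y - X @ b2) ** 2
    before = np.concatenate([[0.0], np.cumsum(r1)[:-1]])  # sum over i < t of r1[i]
    after = np.cumsum(r2[::-1])[::-1]                      # sum over i >= t of r2[i]
    # Plug-in Gaussian log-likelihood of each candidate change point t (assumption).
    with np.errstate(divide="ignore"):
        log_post = np.log(prior) - (before + after) / (2.0 * sigma**2)
    log_post -= log_post.max()  # stabilize before exponentiating
    post = np.exp(log_post)
    return post / post.sum(), b1, b2

X, y, tau = simulate()
post, b1, b2 = change_point_posterior(X, y, uniform_prior(len(y)))
print("true change point:", tau, "posterior mode:", int(post.argmax()))
```

With a uniform prior the weights fall linearly from 1 to 0, so each weighted estimate is a blend of the two regimes that leans toward its own segment. That lean is enough for the residuals to separate the regimes in this toy setting, which gives some intuition for why even weakly informative priors can yield usable change-point estimates.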


Conformalized matrix completion

Neural Information Processing Systems

If the underlying signal is assumed to be low-rank, a range of estimation algorithms have been proposed in the literature, including approaches based on convex relaxations of rank (Candès and Plan, 2010; Candès and Tao, 2010; Koltchinskii et al., 2011; Foygel and Srebro, 2011; Negahban